Abstract

The folding of a polypeptide is an example of the cooperative effects of the
amino-acid residues. Of recent interest is how a secondary structure, such as
a helix, spontaneously forms during the collapse of a peptide from an initial
denatured state. The Monte Carlo implementation of a recent helix-forming
model enables us to study the entire folding process dynamically. As shown
by the computer simulations, the foldability and helical propagation are both
strongly correlated with the nucleation properties of the sequence.

Attempting to understand the complex functional nature of proteins is one of the most
challenging problems in molecular biology. In the past 10 years, considerable effort has been
made to show that these molecules are far from typical polymer chains made of disordered
amino acids. Despite the seemingly disordered nature of the sequence, every protein
possesses some remarkably similar basic characteristics. Much of the current knowledge
has been derived from computer simulations of simplified protein models. One of the
ultimate goals of theoretical modeling is to offer a basis for quantitative comparison with
experimental structural determinations. Therefore, it would be advantageous to generalize
minimal models to an off-lattice, three-dimensional setting. For example, in an off-lattice
Go-type model, a heteropolymer with interactions between residues is constructed in
such a way that the interaction matrix is chosen to yield the desired native state.

The essence of these lattice and off-lattice models is that protein structures are created 
out of heterogeneity of the sequence. There is no doubt that heterogeneity plays a dominant 
role in structure selection; however, secondary structures are known to originate from a 
number of other important effects such as hydrogen bonding. This suggests that additional 
considerations should be made in theoretical models in order to capture structural and 
dynamical properties that go beyond the heterogeneity consideration in a sequence. 

In a recent Letter, we have stressed the need to include a directionally biased
residue-residue potential energy in order to design a significantly ordered native state, using the
helical structure as an example [11]. In particular, we have shown that an almost perfect
helical native structure could be produced from a homopolymer backbone with a square-well
potential that prefers parallel bond angle planes (Fig. 1). This preference, written in a
very compact mathematical form, can be thought of as a simple adaptation of much more
complex hydrogen-bonding and dipole potential energies.

In this Letter, we present the folding dynamics study of helix-forming polymers based
on this model. In a typical numerical experiment, a denatured initial configuration is
well equilibrated at high temperature (Fig. 2a, left), and is then quenched below the coil-helix
transition temperature to $T = 0.6\,\epsilon/k_B$, where $\epsilon$ is the maximum attractive
energy that two monomers can attain when forming a bond. A Monte Carlo (MC) procedure is
implemented after that point, and the entire chain begins to collapse to the ordered helical
state (Fig. 2a, right). The process allows us to examine the kinetics of domain growth of the
ordered segments.

The key features of the helix-forming model are kept the same as in our previous study;
however, some minor changes have been implemented to tailor the model towards a more
realistic helix folding experiment. A randomly chosen monomer is rotated about the axis
defined by its two nearest neighbors, and the attempted rotation is accepted or rejected
according to the Metropolis rule. The bond angles can now fluctuate slightly around $2\pi/3$
with an energy cost [13], which maintains a worm-like backbone and allows for small local
movements. The second change is the replacement of the square-well potential for the
monomer-monomer interaction in our original work by the Lennard-Jones model, which allows
for a smoother dynamic motion when two monomers interact [14]. Our last change breaks the
symmetry between left- and right-handed helices. The vector u described in Fig. 1 is tilted
to make a correct right-handed bonding in a helical state after accounting for the pitch of
the helix [15].
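
As a rough illustration of the move described above, the following Python sketch rotates an
interior monomer about the axis through its two neighbors and applies the Metropolis
criterion. It is a minimal sketch under stated assumptions, not the simulation code used in
this work: the function names, the `max_angle` step size, and the user-supplied `energy`
callback are hypothetical, and the temperature is taken in units of $\epsilon/k_B$.

```python
import math
import random


def rotate_about_axis(p, a, b, angle):
    """Rotate point p by `angle` about the axis through a and b (Rodrigues formula)."""
    k = [b[i] - a[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in k))
    k = [c / norm for c in k]                      # unit vector along the axis
    v = [p[i] - a[i] for i in range(3)]            # p relative to a point on the axis
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    kxv = [k[1] * v[2] - k[2] * v[1],
           k[2] * v[0] - k[0] * v[2],
           k[0] * v[1] - k[1] * v[0]]              # cross product k x v
    kdv = sum(k[i] * v[i] for i in range(3))       # dot product k . v
    return [a[i] + v[i] * cos_t + kxv[i] * sin_t + k[i] * kdv * (1.0 - cos_t)
            for i in range(3)]


def metropolis_accept(delta_e, temperature, rng=random):
    """Accept downhill moves always, uphill moves with probability exp(-dE/T)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def try_rotation(chain, i, energy, temperature, max_angle=0.5, rng=random):
    """Attempt one rotation of interior monomer i; return True if the move is kept.

    `energy(chain)` is a hypothetical callback returning the total energy; the
    directional Lennard-Jones and bond-angle terms of the model would go there.
    """
    angle = rng.uniform(-max_angle, max_angle)
    old = chain[i]
    e_old = energy(chain)
    chain[i] = rotate_about_axis(old, chain[i - 1], chain[i + 1], angle)
    if metropolis_accept(energy(chain) - e_old, temperature, rng):
        return True
    chain[i] = old                                 # reject: restore the old position
    return False
```

Because the rotation axis passes through both neighbors, this move keeps the two adjacent
bond lengths fixed, so only angular degrees of freedom are sampled.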

In the first set of numerical experiments (Model I), chains with number of residues
N = 19, 25, 31, 37, 43, and 49 were considered. For each N, forty folding events were
performed with the allowed maximum folding time $t_{\max}$ listed in Table 1. Fig. 2a is a
time-lapse image of a typical folding event (N = 49), where the left plot corresponds to an
initial configuration, equilibrated at $T = \infty$, and the last plot to a completely folded
state. From the observation of these folding events, three important features emerge.

(1) For shorter chains (N < 31) there is no clear indication of a preferred nucleation 
site; the entire chain folds directly to the ordered helix state. Almost all of the events that 
did not fold into the preferred helical state acquire intermediate, poorly wrapped globular 
states. 

(2) For longer chains, the entire folding event is accompanied by a nucleation-propagation
process, an expected mechanism for a cooperative system. As demonstrated in Fig. 2a, the
nucleation starts at the ends of the chain and gradually heads towards the center, which is
consistent with the different mobility of residues along the chain. The terminal residues are
clearly more mobile due to reduced confinement restrictions on their movement. Once
nucleation at the ends of the chain occurs, the helices begin to propagate inwards. At this
point, an interface between the two helical domains might form. This interface would
eventually dissolve in favor of a single uniform helix along the entire chain. In comparison
with the initial formation of helical segments, which corresponds to only a small fraction of
the net folding time, resolving the discontinuity is much slower and requires transverse
fluctuations, which are limited by the confining geometry of the helix.

(3) It is thus desirable to define order parameters which can be used to quantitatively
describe the kinetics of the helix formation in our numerical experiments. For this model, the
e vector, defined relative to the vector u as noted in Ref. 15 to correct for the pitch and the
right-handedness of the helix, offers a unique direction for describing orientational ordering.
To study the local correlation of the bond orientation, we define the order parameter
$H_1 = \left(\sum_i \mathbf{e}_i \cdot \mathbf{e}_{i+1}\right)/(N-1)$. In a chain with a final
global helical structure, all the e vectors are more or less aligned, so that this order
parameter approaches unity. For a chain with helix domains, as shown in some of the snapshots
in Fig. 2a, this order parameter would also yield a high value, although the two domains might
have helical axes pointing in different directions. Thus $H_1$ is a good measure of local
helical content in a chain, but a poor measure of global ordering. To study the global
ordering in the system, we define a second order parameter,
$H_2 = \left|\sum_i \mathbf{e}_i\right|/N$. This order parameter deviates from unity in a
fractured helix with opposite helical directions and approaches unity in a global helical state.
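
The difference between the two order parameters can be checked numerically. The Python sketch
below evaluates both for a list of unit e vectors; it assumes, for illustration only, that
there is one e vector per entry and that the list length plays the role of N, and the names
`order_parameters`, `single_helix`, and `two_domains` are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def order_parameters(e_vectors):
    """Return (H1, H2) for a sequence of unit e vectors.

    H1 averages e_i . e_{i+1} over neighboring pairs (local order);
    H2 is the magnitude of the summed e vectors divided by their number
    (global order). Assumption: len(e_vectors) is used as N.
    """
    n = len(e_vectors)
    h1 = sum(dot(e_vectors[i], e_vectors[i + 1]) for i in range(n - 1)) / (n - 1)
    total = [sum(e[k] for e in e_vectors) for k in range(3)]
    h2 = math.sqrt(dot(total, total)) / n
    return h1, h2


up = (0.0, 0.0, 1.0)
down = (0.0, 0.0, -1.0)
single_helix = [up] * 10               # all e vectors aligned
two_domains = [up] * 5 + [down] * 5    # two domains with opposite directions

print(order_parameters(single_helix))
print(order_parameters(two_domains))
```

For the aligned chain both parameters equal 1. For the idealized two-domain chain only one of
the nine neighboring pairs is mismatched, so $H_1 = 7/9$, while $H_2$ vanishes because the two
domains cancel.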

Typically, in folding studies of minimal models of proteins, an understanding of the folding
properties can be obtained from a study of the so-called "first passage" times, defined
as the time required for a molecule to first enter the native state when started in an arbitrary
configuration. Here, a segment is regarded as helical when both order parameters exceed 0.95,
a threshold chosen to ensure that the segments are near-perfect helices. The
mean first passage times (MFPTs) are shown in Table 1.
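
The first-passage criterion can be sketched in a few lines of Python. This is an illustrative
sketch only: the function names are hypothetical, time is counted in units of recorded
samples, and averaging the MFPT over the folded runs alone is an assumption, since the
treatment of runs that did not fold within the maximum time is not specified here.

```python
def first_passage_time(h1_series, h2_series, threshold=0.95):
    """Return the first sample index at which both order parameters exceed
    the threshold, or None if the run never reaches the helical state."""
    for step, (h1, h2) in enumerate(zip(h1_series, h2_series)):
        if h1 > threshold and h2 > threshold:
            return step
    return None


def mfpt_summary(times):
    """Return (mean first passage time, percentage of runs that did not fold).

    Assumption: the mean is taken over folded runs only; unfolded runs
    (None) contribute to the DNF percentage but not to the mean.
    """
    folded = [t for t in times if t is not None]
    dnf_percent = 100.0 * (len(times) - len(folded)) / len(times)
    mean = sum(folded) / len(folded) if folded else None
    return mean, dnf_percent
```

Requiring both parameters to pass the threshold excludes the two-domain configurations, which
have a high $H_1$ but a low $H_2$.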

Good folding proteins have MFPTs that obey a power-law behavior when scaled with
system size [16],

    $t_{\mathrm{mfp}} \sim N^{\lambda}$,    (1)

where $t_{\mathrm{mfp}}$ is the averaged first passage time, and $\lambda$ is a characteristic
exponent. The exponent $\lambda$ varies depending on how the sequences were designed. For
example, randomly sequenced chains scaled with $\lambda \approx 6$, and sequences designed from
a Miyazawa and Jernigan [17] potential scaled with $\lambda = 4.5$ [16]. It has been observed
that sequences designed from protein-like potentials were better folders (had smaller
$\lambda$).

Using the data from Table 1, we can determine the $\lambda$ associated with our helical model
by examining the data on a log-log plot (Fig. 3). Fitting the data to Eq. (1) with a
least-squares method, we obtained a value of $\lambda_1 = 3.7(2)$. Our model demonstrates the
characteristics of a well-designed protein sequence, not a surprising conclusion since we know
that helices exist in real proteins. What makes a crucial difference in comparison with
previous results is that we are dealing with a homopolymer here, not a heteropolymer as in
other studies.
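
A fit to Eq. (1) amounts to a straight-line least-squares fit of $\log t_{\mathrm{mfp}}$
against $\log N$, whose slope is $\lambda$. The Python sketch below shows one way to do this;
the function name `fit_power_law` is hypothetical, and the times used in the example are
synthetic values generated from an exact power law purely to exercise the function. They are
not the entries of Table 1.

```python
import math


def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(c) + lam * log(N).

    Returns (lam, c), the exponent and prefactor of t ~ c * N**lam.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    lam = sxy / sxx                      # slope on the log-log plot
    c = math.exp(mean_y - lam * mean_x)  # intercept converted back to a prefactor
    return lam, c


# Synthetic check: chain lengths from the text, times from an exact power law.
sizes = [19, 25, 31, 37, 43, 49]
synthetic_times = [2.0 * n ** 3.7 for n in sizes]
lam, c = fit_power_law(sizes, synthetic_times)
```

With real simulation data the points scatter about the line, and the quoted uncertainty in the
exponent would come from the standard error of the slope, which this sketch does not compute.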

The fact that helix nucleation starts from the terminals rather than the center leads
to a simple question: can the folding scenario of the entire chain be altered by designing
a heteropolymer that contains monomers with different functionality? In particular, in
our second set of numerical experiments, two segments of six monomers were attached to
the original terminals of the helix-forming chain (Model II). These new segments are neutral
and interact only through an excluded-volume interaction with no attractions. The native
structure of the new chain is completely determined by the helix-forming segment, and
the neutral terminal segments only display a partial random coil conformation. Now, the
terminal residues of the attractive segment no longer have high mobility and will have
nearly the same likelihood of nucleation as the interior monomers. With the addition of the
non-attractive segments, the chain is now considered helical if the attractive residues, not
the added ones, meet the requirements stated above for helicity.

A typical folding event is displayed in Fig. 2b in a series of time-lapse plots, with the
neutral residues represented in black. In contrast to our previous set of experiments, it is
now more likely to form a single, central helical nucleation site, rather than the two-domain
structure that we observed before. It would appear that adding the two neutral sections should
have the net effect of slowing down the dynamics of the segment, since a longer segment
has a much slower dynamic response. However, a striking feature of adding neutral
segments is that the MFPTs actually decrease dramatically in comparison with those of the
isolated counterpart, as displayed in Table 1. The reduction in the ability of the terminal
residues to nucleate causes a more uniform distribution of nucleation sites along the chain,
and decreases the overall nucleation probability. This means that the initial nucleation takes
longer, but a nucleation site that already exists has a much longer time to propagate through
the entire segment before a second nucleation site occurs. Thus, there is a significant
reduction in the folding times because a discontinuity in the segment does not have to be
resolved. Multiple nucleation sites can still occur; only 50% of the chains fold with a single
nucleation region. In contrast, an isolated helical segment almost always folds with a
discontinuity.

We have made a log-log plot of the data in Fig. 3, where the MFPTs are fitted to the
same power law in Eq. (1). The characteristic exponent $\lambda_2 = 2.4(3)$ is significantly
lower than the exponent found above ($\lambda_1 = 3.7$). The new $\lambda_2$ demonstrates that
the folding process is fundamentally different. Longer helices are more easily formed if the
probability of seeding a segment is relatively small compared to the propagation time of the
isolated helical segment.

To observe the folding kinetics from yet another angle, we examine a third model (Model
III) in which we attach only a single neutral segment to one end of the helix-forming chain.
The folding times for this model are shown in Table 1, and the fitted exponent is
$\lambda_3 = 3.5(1.0)$. The dynamics of this type of segment is a combination of the two cases
already discussed. The nucleation of the helical segment occurs at the free end of the chain,
as in an isolated segment. What differs is that this is likely to be the only nucleation site,
thus reducing the probability of having to resolve a discontinuity. The propagation of the
helical segment, however, occurs through longitudinal fluctuations along the chain contour and
is retarded by sharp changes in the chain contour. This slowing of the propagation provides an
opportunity to generate a second nucleation site in the remaining segment, which might give
rise to a discontinuity that retards the dynamics. Only approximately 15% of the chains now
fold with a single nucleation site.

Thus, $\lambda$ is sensitive to the probability of multiple nucleation sites. More than one
nucleation site decreases the foldability of the segment by creating a discontinuity. In our
previous work [11], the anisotropy of the potential was demonstrated to play a significant role
in the foldability of a helical segment. Klimov and Thirumalai postulated that the foldability
of a protein is related to the relative separation of the coil-globular and globular-folded
transitions through the parameter $\sigma = |T_f - T_g|/T_g$, where $T_g$ is the coil-globular
transition temperature and $T_f$ the globular-helix transition temperature. We showed that
decreasing the anisotropy increases the value of $\sigma$ in our model.

To further explore the kinetic features of Model II above, where m = 6 was used (high
anisotropy), we design a new set of experiments using m = 2 (low anisotropy). The MFPTs
were collected for the same helical segment lengths as shown in Table 1, yielding
$\lambda_2(m = 2) = 3.1(5)$, which is higher than $\lambda_2(m = 6) = 2.4$. Comparing the data
in Table 1 as well as the corresponding $\lambda$'s, we conclude that there is indeed a
reduction in the folding times due to the reduction in anisotropy. Results from our previous
work show that there is a globular state at high temperatures which becomes stable as the
anisotropy is decreased. This increased stability will reduce the probability of nucleation
and decrease the rate of helical propagation, which accounts for the observed increase in
$\lambda$.

In summary, we have shown for the first time that the folding times from the coil state
to the helical state scale as a power law with the system size, as expected for protein-like
systems. Both the folding times and the scaling exponents are altered when the nucleation
probability is adjusted. Nucleation is not the only important factor in folding: the rate of
propagation relative to nucleation is a dominating factor in the fast-folding characteristics
of a helical segment. These results demonstrate the significance of "hot" sites, or conserved
residues [18], within a protein. These sites are the key nucleation regions of the folding
process and are important for the creation of the native state. As we have demonstrated in
these simulations, it is important to create a dominant nucleation site to ensure that
propagation can proceed throughout the entire chain without an alternate nucleation site
forming; multiple "hot" sites would prolong the folding time if they are formed too early in
the folding.

We would like to thank NSERC for the financial support of this work.
FIGURES

FIG. 1. Interaction between two residues labeled i and j. The u vector is normal to the bond
plane. A modified Lennard-Jones interaction is assumed, with the modification that the vector
$\mathbf{u}_i$ prefers to align with the vector $\mathbf{u}_j$ (see definition in footnote 19).

FIG. 2. Typical folding scenarios for a helical segment of length N = 49: (a) multiple
nucleation and (b) single nucleation.

FIG. 3. Scaling of average folding time vs. polymer length, for helical segments (squares) and
helical segments with two tethered segments (circles).

TABLES

TABLE I. Folding time table. N is the number of helix-forming residues, $t_{\mathrm{mfp}}$ is
the average first passage time, and % DNF is the percentage that did not fold. In Model I, the
maximum folding time allowed is 10, 20, 50, 100, 150, 200 ($\times 10^6$ MC steps) and in
Models II and III, 25, 50, 100, 150, 200, 250 ($\times 10^6$ MC steps), for
N = 19, 25, 31, 37, 43, 49, respectively.